Projet EU DEMERSTEM http://pescao-demerstem.org/ Projet EU DEMERSTEM http://pescao-demerstem.org/

Introduction

The GPSMonitoring packages, developped during the EU/ECOWAS DEMERSTEM project, aims to facilitate GPS dataset managment for artisanal fisheries monitoring. Especially, it will allow easy filtering process (GPS.curation) to keep only tracks on sea that will only mix trajectory to the fishing area and fishing positions.

Additional function will :

To install this package, use following R command : devtools::install_github(‘polehalieutique/GPSMonitoring’)

loading Gpx files

A function GPX.load is provided to load one GPS file or a list of GPS files included in a given directory. A dataset is also provided within the package itself (DEMERSTEM_GPS.Rdata) and contains a set of about 600 000 GPS positions from Kamsar Port Néné (Guinée) fishing monitoring System. This dataset is provided through a data paper (DEMERSTEM project : dx.doi.org)

The Guinean dataset will be used to illustrate package capabilities in the present document. When loading the DEMERSTEM_gps.Rdata file, located in the GPSMonitoring package data directory, a GPSdataset dataframe is loaded.

data(GPSdataset)

GPSdataset<-GPSdataset %>% filter(date_heure<"2019-09-30 20:20:53 CEST")

For the vignette, i will decrease the dataset decreasing the timeframe to the 30-09-2019.

The original dataset is then transform in a sf object. We also add original GPS filename of the dataset. Once the dataframe is becoming a sf object, we can plot it using geom_sf function.

GPSdataset %>% mutate(filename=paste(code_village,code_engin,code_pecheur,'.gpx',sep='_')) %>% 
arrange (filename) %>% dplyr::distinct(code_village,code_engin,code_pecheur,filename) %>% 
dplyr::mutate(track_fid=row_number(),track_seg_id=track_fid) %>% 
inner_join(GPSdataset) %>%  
group_by(filename) %>% arrange (filename,date_heure) %>% dplyr::mutate(track_seg_point_id = row_number()) %>% 
  dplyr::rename(time=date_heure) %>%
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326,remove=FALSE)->gps.all
## Joining, by = c("code_village", "code_engin", "code_pecheur")
ggplot(gps.all)+geom_sf(aes(color=filename),size=0.2)

A map of the dataset area coastal shape is also include in the package (data/fond.Rdata)

data(fond)

ggplot(gps.all)+geom_sf(data=fond)+geom_sf(aes(color=filename),size=0.2)

Data curation

Data curation will include data filtering, Fishing trip information and frequency standardization.

Area Of Interest and Area of Non Interest

Once the data are loaded we have a data curation process : * first step we define the Area Of Interest (emprise) and we will keep all the positions within this AOI (pol.extent) * Second step we can delete positions included in such Area of non interest. Especially we could consider the port as an area of interest. Considering the the GPS is not switch off when the boat is not active (in the anchor) we want to avoid to take position within a perimter of X meters around the port into account. So we want to exlude the part of the dataset that is within this exclude area that could be defined by the port location and a perimeter.

emprise<-matrix(c(-17,11,-14,11,-14,9,-15,9,-17,11),ncol=2, byrow=TRUE)
pol.extent <-st_as_sf(st_sfc(st_polygon(list(emprise))))
st_crs(pol.extent) = 4326

ports<-data.frame(code=c('KPN'),long=c(-14.61),lat=c(10.6789))
ports.sf<- st_as_sf(ports,coords=c("long","lat"),crs=4326)
exclude<-st_as_sf(st_union(st_buffer(ports.sf,1000)))

ggplot(exclude)+geom_sf(data=pol.extent)+geom_sf(fill='red')

gps.all.cur<-GPS.curation(gps.all,extent=pol.extent,exclude=exclude)
## [1] "input number of position :271600"
## [1] "output number of position :269378"
g1<-ggplot(gps.all)+geom_sf()+geom_sf(data=exclude,fill=rgb(0.8,0.11,0.1,0.5))+
  geom_sf(data=pol.extent,fill=rgb(0.1,0.8,0.1,0.5))

g2<-ggplot(gps.all.cur)+geom_sf()+geom_sf(data=exclude,fill=rgb(0.8,0.11,0.1,0.5))+
  geom_sf(data=pol.extent,fill=rgb(0.1,0.8,0.1,0.5))

ggarrange(g1,g2)

A second way to define the AOI is to use the create.extent function that will display a map of the area and the user will define manually the area of interest.

Using this tool, you can create a polygon that will define the extent of GPS position to take into account.The interface is open with a satellite map of the area and a polygon which is the input parameter. In the folllowing exemple the input parameter is the convexhul of the dataset.

Include video of pol.extent definition usingcreate.extent function.

pol.extent<-create.extent(st_convex_hull(st_union(gps.all)))
data(emprise)

ggplot(emprise)+geom_sf()+
  ggtitle("More precise pol.extent created using create.extent function")

pol.extent<-emprise

gps.all.cur<-GPS.curation(gps.all,extent=pol.extent,exclude=exclude)
## [1] "input number of position :271600"
## [1] "output number of position :267046"
g1<-ggplot(gps.all)+geom_sf()+geom_sf(data=exclude,fill=rgb(0.8,0.11,0.1,0.5))+
  geom_sf(data=pol.extent,fill=rgb(0.1,0.8,0.1,0.5))

g2<-ggplot(gps.all.cur)+geom_sf()+geom_sf(data=exclude,fill=rgb(0.8,0.11,0.1,0.5))+
  geom_sf(data=pol.extent,fill=rgb(0.1,0.8,0.1,0.5))

ggarrange(g1,g2)

Cut trajectory Fishing trips

We considered that if there is more than n (minutes) between 2 position, we are starting a new traject, a new fihing trip.

The id of a GPS record is the filemename

limit<-600*120 #2 hours between two point and we consider a new traject
 #limit<-240
head(gps.all.cur)
## Simple feature collection with 6 features and 12 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -14.6416 ymin: 10.6178 xmax: -14.621 ymax: 10.6641
## Geodetic CRS:  WGS 84
## # A tibble: 6 × 13
##   code_village code_en…¹ code_…² filen…³ track…⁴ track…⁵ time                longi…⁶ latit…⁷ track…⁸           geometry
##   <chr>        <chr>     <chr>   <chr>     <int>   <int> <dttm>                <dbl>   <dbl>   <int>        <POINT [°]>
## 1 KPN          FMCl      0003    KPN_FM…       1       1 2019-04-14 10:02:18   -14.6    10.7       3  (-14.621 10.6641)
## 2 KPN          FMCl      0003    KPN_FM…       1       1 2019-04-14 10:07:18   -14.6    10.7       4 (-14.6251 10.6546)
## 3 KPN          FMCl      0003    KPN_FM…       1       1 2019-04-14 10:12:18   -14.6    10.6       5   (-14.63 10.6457)
## 4 KPN          FMCl      0003    KPN_FM…       1       1 2019-04-14 10:17:18   -14.6    10.6       6 (-14.6341 10.6366)
## 5 KPN          FMCl      0003    KPN_FM…       1       1 2019-04-14 10:22:18   -14.6    10.6       7  (-14.6376 10.627)
## 6 KPN          FMCl      0003    KPN_FM…       1       1 2019-04-14 10:27:18   -14.6    10.6       8 (-14.6416 10.6178)
## # … with 2 more variables: `_leaflet_id` <dbl>, feature_type <chr>, and abbreviated variable names ¹​code_engin,
## #   ²​code_pecheur, ³​filename, ⁴​track_fid, ⁵​track_seg_id, ⁶​longitude, ⁷​latitude, ⁸​track_seg_point_id
GPS.data<-gps.all.cur
gps.all.cur_traj<-GPS.add_traj_number(gps.all.cur,limit)  
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
gps.all.cur_traj %>% mutate(x = sf::st_coordinates(.)[,1],
                y = sf::st_coordinates(.)[,2])->gps.all.cur_traj


unique(gps.all.cur_traj$track_fid)
##   [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  18  19  20  21  22  23  25  26  27  28  29  30  31  32
##  [31]  33  34  35  36  37  39  40  41  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63  64
##  [61]  65  66  67  68  69  70  71  72  73  74  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95
##  [91]  97  99 100 101 102 103 104 105 106 107 108 109 110 111 113 114 115 116 117 118 119 120 122 123 124 126 127 128 129 130
## [121] 131 132 133 134 135 136 137 139 140 141 142 143 144 145
unique(gps.all.cur_traj[gps.all.cur_traj$track_fid==2,]$no_trajet)
## [1] 2
head(gps.all.cur_traj)
## Simple feature collection with 6 features and 16 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -14.6465 ymin: 10.6091 xmax: -14.6251 ymax: 10.6546
## Geodetic CRS:  WGS 84
## # A tibble: 6 × 17
##   code_village code_en…¹ code_…² filen…³ track…⁴ track…⁵ time                longi…⁶ latit…⁷ track…⁸           geometry
##   <chr>        <chr>     <chr>   <chr>     <dbl>   <int> <dttm>                <dbl>   <dbl>   <int>        <POINT [°]>
## 1 KPN          FMCl      0003    KPN_FM…       1       1 2019-04-14 10:07:18   -14.6    10.7       4 (-14.6251 10.6546)
## 2 KPN          FMCl      0003    KPN_FM…       1       1 2019-04-14 10:12:18   -14.6    10.6       5   (-14.63 10.6457)
## 3 KPN          FMCl      0003    KPN_FM…       1       1 2019-04-14 10:17:18   -14.6    10.6       6 (-14.6341 10.6366)
## 4 KPN          FMCl      0003    KPN_FM…       1       1 2019-04-14 10:22:18   -14.6    10.6       7  (-14.6376 10.627)
## 5 KPN          FMCl      0003    KPN_FM…       1       1 2019-04-14 10:27:18   -14.6    10.6       8 (-14.6416 10.6178)
## 6 KPN          FMCl      0003    KPN_FM…       1       1 2019-04-14 10:32:18   -14.6    10.6       9 (-14.6465 10.6091)
## # … with 6 more variables: `_leaflet_id` <dbl>, feature_type <chr>, no_trajet <dbl>, duree <dbl>, x <dbl>, y <dbl>, and
## #   abbreviated variable names ¹​code_engin, ²​code_pecheur, ³​filename, ⁴​track_fid, ⁵​track_seg_id, ⁶​longitude, ⁷​latitude,
## #   ⁸​track_seg_point_id

Standardize duration between 2 position

As we can see here, the ping frequency is not constant. So we decide to recalibrate all the trajectory using a constant frequency of dt=300 s (5 minutes). The package ADEhabitat is used.

ggplot(filter(R.gps.all.cur_traj,duree<350))+geom_histogram(aes(x=duree))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

projcrs <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
R.gps.all.cur_traj <- st_as_sf(x = R.gps.all.cur_traj,
               coords = c("x", "y"),
               crs = projcrs)



ggplot(R.gps.all.cur_traj)+geom_sf(aes(color=no_trajet))+ggtitle(paste("Tracks redistribute in a ",step_dt," period"))

## Join Observation data

Once the GPS pings are ready, we have now to predict fishing event. First we have to observe fishing events and to flag them on GPS positions.

The first set of observed data could be provided using on board observers. The datset of observed fishing event will be composed of start time and end time of fishing event by observe fishermen.

data(Observed_FO)
head(Observed_FO) %>% kable()
X code_village code_engin code_pecheur start_op end_op fishing
1 KPN FMCl 3 2019-04-14 12:37:18 2019-04-14 16:27:18 active
2 KPN FMCl 3 2019-04-14 17:47:18 2019-04-15 09:06:18 active
3 KPN FMCl 3 2019-04-15 18:02:18 2019-04-15 22:03:18 active
4 KPN FMCl 3 2019-04-15 22:52:18 2019-04-16 05:35:18 active
5 KPN FMCl 3 2019-04-16 17:45:18 2019-04-17 02:00:18 active
6 KPN FMCl 3 2019-05-16 17:03:10 2019-05-16 22:43:10 active

We thus need to join observed fishing event to GPS dataset using (code_village,code_engin,code_pecheur) and the date of the GPS position that could be rely to start and end fishing operation.

R.gps.all.cur_traj %>% tidyr::separate(filename,c('code_village','code_engin','code_pecheur'),remove=FALSE,sep='_') %>% 
  mutate(code_pecheur=as.numeric(code_pecheur)) %>%  mutate(longitude = sf::st_coordinates(.)[,1],
                                             latitude = sf::st_coordinates(.)[,2])->R2
## Warning: Expected 3 pieces. Additional pieces discarded in 100960 rows [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
## 17, 18, 19, 20, ...].
#add id number for each position of a traject
R2$id<-(R2 %>% group_by(no_trajet) %>% mutate(id=row_number()) %>% select (id))$id

R2$date<-as.POSIXct(R2$date)
R2 %>% mutate(obs=as.integer(1),activity='UK')->R2
head(R2)
## Simple feature collection with 6 features and 20 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -14.6465 ymin: 10.6091 xmax: -14.6251 ymax: 10.6546
## CRS:           +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
##                      date      dx      dy        dist  dt        R2n abs.angle   rel.angle no_trajet track_fid
## 1...1 2019-04-14 10:07:18 -0.0049 -0.0089 0.010159724 300 0.00000000 -2.074071          NA         1         1
## 2...2 2019-04-14 10:12:18 -0.0041 -0.0091 0.009980982 300 0.00010322 -1.994107  0.07996368         1         1
## 3...3 2019-04-14 10:17:18 -0.0035 -0.0096 0.010218121 300 0.00040500 -1.920403  0.07370363         1         1
## 4...4 2019-04-14 10:22:18 -0.0040 -0.0092 0.010031949 300 0.00091801 -1.980924 -0.06052022         1         1
## 5...5 2019-04-14 10:27:18 -0.0049 -0.0087 0.009984989 300 0.00162649 -2.083731 -0.10280767         1         1
## 6...6 2019-04-14 10:32:18 -0.0051 -0.0083 0.009741663 300 0.00252821 -2.121779 -0.03804747         1         1
##                 filename code_village code_engin code_pecheur duree                 geometry longitude latitude id obs
## 1...1 KPN_FMCl_0003_.gpx          KPN       FMCl            3   300 POINT (-14.6251 10.6546)  -14.6251  10.6546  1   1
## 2...2 KPN_FMCl_0003_.gpx          KPN       FMCl            3   300   POINT (-14.63 10.6457)  -14.6300  10.6457  2   1
## 3...3 KPN_FMCl_0003_.gpx          KPN       FMCl            3   300 POINT (-14.6341 10.6366)  -14.6341  10.6366  3   1
## 4...4 KPN_FMCl_0003_.gpx          KPN       FMCl            3   300  POINT (-14.6376 10.627)  -14.6376  10.6270  4   1
## 5...5 KPN_FMCl_0003_.gpx          KPN       FMCl            3   300 POINT (-14.6416 10.6178)  -14.6416  10.6178  5   1
## 6...6 KPN_FMCl_0003_.gpx          KPN       FMCl            3   300 POINT (-14.6465 10.6091)  -14.6465  10.6091  6   1
##       activity
## 1...1       UK
## 2...2       UK
## 3...3       UK
## 4...4       UK
## 5...5       UK
## 6...6       UK
for (compteur in seq(1,length(Observed_FO$X)))
{  
#compteur<-1  
R2<-R2 %>% mutate(activity=case_when(
code_village==Observed_FO[compteur,]$code_village & code_engin==Observed_FO[compteur,]$code_engin  & code_pecheur==Observed_FO[compteur,]$code_pecheur & date>Observed_FO[compteur,]$start_op & date<Observed_FO[compteur,]$end_op ~ 'active',
TRUE ~ activity),obs=case_when(
code_village==Observed_FO[compteur,]$code_village & code_engin==Observed_FO[compteur,]$code_engin  & code_pecheur==Observed_FO[compteur,]$code_pecheur & date>Observed_FO[compteur,]$start_op & date<Observed_FO[compteur,]$end_op ~ Observed_FO[compteur,]$X,
TRUE ~ obs))
}
compteur<-40
liste_trajet_avec_obs<-R2 %>% st_drop_geometry()%>% filter(activity=='active')%>% dplyr::distinct(no_trajet)                           




ggplot(filter(R2,no_trajet %in% liste_trajet_avec_obs$no_trajet))+geom_sf(aes(color=as.factor(activity)),lwd=0.1)+ggtitle("Tracks redistribute in a 60s period with observation")

Modelization of fishing event using observed data

First an overview of observed data

Observed data are flaging GPS position to be active or not (ie. Fishing or not). Thus we can look at fishing or non fishing event and associated variable (speed, angle, R2n).

R2n is the the squared net displacement between the current relocation and the first relocation of the trajectory. We use distance between two point which is the same as the speed as frequency is constant.

liste_trajet_avec_obs<-R2 %>% st_drop_geometry()%>% filter(activity=='active')%>% dplyr::distinct(no_trajet)     

R2_avec_obs<-R2 %>%  filter(no_trajet %in% liste_trajet_avec_obs$no_trajet) 


R2_avec_obs %>% st_drop_geometry() %>% 
  group_by(code_engin,activity,dist) %>% dplyr::summarise(frequence=n()) %>%  ggplot()+
  geom_smooth(aes(x=dist,y=frequence,col=activity))+
  facet_wrap(~code_engin,scales = "free")+ggtitle('Fishing event ~ distance(=speed)')
## `summarise()` has grouped output by 'code_engin', 'activity'. You can override using the `.groups` argument.
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 9 rows containing non-finite values (`stat_smooth()`).

bb<-st_bbox(R2_avec_obs)

ggplot(R2_avec_obs)+geom_sf(data=fond)+geom_sf(aes(color=activity),alpha=0.45,lwd=1)+
    scale_fill_viridis_c(option = "plasma", trans = "sqrt")+xlim(as.numeric(bb$xmin),as.numeric(bb$xmax))+ ylim(as.numeric(bb$ymin), as.numeric(bb$ymax))+facet_wrap(~code_engin)

unique(R2_avec_obs$code_engin)
## [1] "FMCl" "FMCy" "FMD"  "FMEg" "PA"   "YO"
engin_encours<-'FMCy'

head(R2_avec_obs)
## Simple feature collection with 6 features and 20 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -14.6465 ymin: 10.6091 xmax: -14.6251 ymax: 10.6546
## CRS:           +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
##                      date      dx      dy        dist  dt        R2n abs.angle   rel.angle no_trajet track_fid
## 1...1 2019-04-14 10:07:18 -0.0049 -0.0089 0.010159724 300 0.00000000 -2.074071          NA         1         1
## 2...2 2019-04-14 10:12:18 -0.0041 -0.0091 0.009980982 300 0.00010322 -1.994107  0.07996368         1         1
## 3...3 2019-04-14 10:17:18 -0.0035 -0.0096 0.010218121 300 0.00040500 -1.920403  0.07370363         1         1
## 4...4 2019-04-14 10:22:18 -0.0040 -0.0092 0.010031949 300 0.00091801 -1.980924 -0.06052022         1         1
## 5...5 2019-04-14 10:27:18 -0.0049 -0.0087 0.009984989 300 0.00162649 -2.083731 -0.10280767         1         1
## 6...6 2019-04-14 10:32:18 -0.0051 -0.0083 0.009741663 300 0.00252821 -2.121779 -0.03804747         1         1
##                 filename code_village code_engin code_pecheur duree                 geometry longitude latitude id obs
## 1...1 KPN_FMCl_0003_.gpx          KPN       FMCl            3   300 POINT (-14.6251 10.6546)  -14.6251  10.6546  1   1
## 2...2 KPN_FMCl_0003_.gpx          KPN       FMCl            3   300   POINT (-14.63 10.6457)  -14.6300  10.6457  2   1
## 3...3 KPN_FMCl_0003_.gpx          KPN       FMCl            3   300 POINT (-14.6341 10.6366)  -14.6341  10.6366  3   1
## 4...4 KPN_FMCl_0003_.gpx          KPN       FMCl            3   300  POINT (-14.6376 10.627)  -14.6376  10.6270  4   1
## 5...5 KPN_FMCl_0003_.gpx          KPN       FMCl            3   300 POINT (-14.6416 10.6178)  -14.6416  10.6178  5   1
## 6...6 KPN_FMCl_0003_.gpx          KPN       FMCl            3   300 POINT (-14.6465 10.6091)  -14.6465  10.6091  6   1
##       activity
## 1...1       UK
## 2...2       UK
## 3...3       UK
## 4...4       UK
## 5...5       UK
## 6...6       UK
R2_avec_obs %>% filter(code_engin==engin_encours) %>% ggplot()+geom_line(aes(x=date,y=dist))+
  geom_point(aes(x=date,dist,col = activity))  +
  facet_wrap(~no_trajet,scale='free')+ggtitle(paste("For gear ",engin_encours," Fishing event ~ distance (=speed)",sep=""))
## Warning: Removed 1 row containing missing values (`geom_line()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).

R2_avec_obs %>% filter(code_engin==engin_encours) %>% ggplot()+geom_line(aes(x=date,y=R2n))+
  geom_point(aes(x=date,R2n,col = activity))  +
  facet_wrap(~no_trajet,scale='free')+ggtitle(paste("For gear ",engin_encours," Relation observation et R2n",sep=''))

R2_avec_obs %>% filter(code_engin==engin_encours) %>% ggplot()+geom_line(aes(x=date,y=rel.angle))+
  geom_point(aes(x=date,rel.angle,col = activity))  +
  facet_wrap(~no_trajet,scale='free')+ggtitle(paste("For gear ",engin_encours,"  Relation observation et rel.angle",sep=''))
## Warning: Removed 2 rows containing missing values (`geom_line()`).
## Warning: Removed 29 rows containing missing values (`geom_point()`).

GLM model between fishing event and speed or other trajectory parameters

Model definition

observation<-'activity'

#On teste avec les 3 paramètres 
gear.glm<-model.traj.glm(filter(R2,code_engin==engin_encours),observation='activity',form='dist+abs.angle')
## [1] "observation2~dist+abs.angle"
summary(gear.glm)
## 
## Call:
## glm(formula = formula, data = traj2)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.14771  -0.12709  -0.10865  -0.08811   0.93608  
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.127874   0.006191  20.654  < 2e-16 ***
## dist        -11.553706   1.923562  -6.006 2.09e-09 ***
## abs.angle    -0.008435   0.003123  -2.701  0.00694 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.09465995)
## 
##     Null deviance: 332.15  on 3472  degrees of freedom
## Residual deviance: 328.47  on 3470  degrees of freedom
##   (26 observations effacées parce que manquantes)
## AIC: 1673.5
## 
## Number of Fisher Scoring iterations: 2
summary(gear.glm)
## 
## Call:
## glm(formula = formula, data = traj2)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.14771  -0.12709  -0.10865  -0.08811   0.93608  
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.127874   0.006191  20.654  < 2e-16 ***
## dist        -11.553706   1.923562  -6.006 2.09e-09 ***
## abs.angle    -0.008435   0.003123  -2.701  0.00694 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.09465995)
## 
##     Null deviance: 332.15  on 3472  degrees of freedom
## Residual deviance: 328.47  on 3470  degrees of freedom
##   (26 observations effacées parce que manquantes)
## AIC: 1673.5
## 
## Number of Fisher Scoring iterations: 2
plot(gear.glm)

Prédiction using GLM model

R2.pred<-glm.predict(filter(R2,code_engin==engin_encours),gear.glm,seuil=0.5)

p1<-ggplot(R2.pred)+geom_sf(aes(color=as.factor(predict.glm)),alpha=0.45,lwd=0.1)+
    scale_fill_viridis_c(option = "plasma", trans = "sqrt")
p2<-ggplot(filter(R2_avec_obs,code_engin==engin_encours))+geom_sf(aes(color=activity),alpha=0.45,lwd=0.1)+
    scale_fill_viridis_c(option = "plasma", trans = "sqrt")
ggarrange(p1,p2,ncol=1)

Obvisouly the Observed Fishing activities does not seem to be really efficient to produce a tight prediction

How to improve observed values

We can review observed fishing activities using visual interface: we define a new observation column : activity_plus

If we are not confident in using on board observation, the second way to have “observed” fishing event is to qualify trajectory in active or not active segments. To do that, we will use the silicoTraj function that will help you to qualify fishing event on a set of trajectories (the first 5 trajectories here selected in the longuest trajectories)

The dataframe traj_to_observe contains the list of 5 (top_n(5)) track number.

We will use the same function silicoTraj in two different modes sequencially : * First in mode speed where i qualify GPS ping using speed * Second one where i can qualify GPS ping directly on a map

R2 %>% st_drop_geometry() %>% dplyr::filter(code_engin==engin_encours) %>% dplyr::group_by(no_trajet) %>% 
  dplyr::summarize(nb_positions=n()) %>% dplyr::arrange(desc(nb_positions)) %>% dplyr::top_n(5) -> traj_to_observe

R2 %>% filter(no_trajet %in% traj_to_observe$no_trajet) %>% 
  ggplot()+geom_sf(aes(color=as.factor(no_trajet)))+facet_wrap(~no_trajet)

#I create the new column and set values to Unknown 
R2$activity_plus<-'UK'

for (i in traj_to_observe$no_trajet)
{
  R2<-track_replace(R2,silicoTraj(filter(R2,no_trajet==i),mode='speed'))
}


for (i in traj_to_observe$no_trajet)
{
  R2<-track_replace(R2,silicoTraj(filter(R2,no_trajet==i),mode='map'))
}

With these new set of data i try to improve the model

gear.glm.plus<-model.traj.glm(filter(R2,code_engin==engin_encours),observation="activity_plus",form= "dist+abs.angle")
## [1] "observation2~dist+abs.angle"
summary(gear.glm.plus)
## 
## Call:
## glm(formula = formula, data = traj2)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.04042  -0.03348   0.01827   0.08290   0.45586  
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.035e+00  2.207e-03  469.17   <2e-16 ***
## dist        -1.259e+02  6.622e-01 -190.05   <2e-16 ***
## abs.angle   -1.869e-02  1.097e-03  -17.03   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.02639266)
## 
##     Null deviance: 1155.92  on 7563  degrees of freedom
## Residual deviance:  199.55  on 7561  degrees of freedom
##   (75 observations effacées parce que manquantes)
## AIC: -6021.9
## 
## Number of Fisher Scoring iterations: 2
plot(gear.glm.plus)

R2.pred.plus<-glm.predict(filter(R2,code_engin==engin_encours),gear.glm.plus,seuil=0.5)


p1<-ggplot(R2.pred.plus)+geom_sf(aes(color=as.factor(predict.glm)),alpha=0.45,lwd=1)+
    scale_fill_viridis_c(option = "plasma", trans = "sqrt")
p2<-ggplot(filter(R2.pred.plus,code_engin==engin_encours))+geom_sf(aes(color=activity_plus),alpha=0.45,lwd=1)+
    scale_fill_viridis_c(option = "plasma", trans = "sqrt")
ggarrange(p1,p2,ncol=1)

Conclusion, on board observers data will be trashed and we can easily have better qualification using less energy, less funds but more expert knowledge in fishing events.

Add a new covariable : the number of position in the same spatial and temporal windows

R2test_retour<-all.add.nb.point(R2.pred.plus,r=2000,temp_windows=20)
## Joining, by = c("no_trajet", "id")
ggplot(filter(R2test_retour,no_trajet==traj_to_observe$no_trajet[1]))+geom_sf(aes(color=(circle2000)))

ggplot(filter(R2test_retour,no_trajet==traj_to_observe$no_trajet[1]))+geom_point(aes(x=id,y=circle2000,color=activity_plus))

Using this new variable, we try to improve the model

gear.glm.plus.nb<-model.traj.glm(filter(R2test_retour,code_engin==engin_encours),observation="activity_plus",form= "dist+abs.angle+circle2000")
## [1] "observation2~dist+abs.angle+circle2000"
summary(gear.glm.plus)
## 
## Call:
## glm(formula = formula, data = traj2)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.04042  -0.03348   0.01827   0.08290   0.45586  
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.035e+00  2.207e-03  469.17   <2e-16 ***
## dist        -1.259e+02  6.622e-01 -190.05   <2e-16 ***
## abs.angle   -1.869e-02  1.097e-03  -17.03   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.02639266)
## 
##     Null deviance: 1155.92  on 7563  degrees of freedom
## Residual deviance:  199.55  on 7561  degrees of freedom
##   (75 observations effacées parce que manquantes)
## AIC: -6021.9
## 
## Number of Fisher Scoring iterations: 2
summary(gear.glm.plus.nb)
## 
## Call:
## glm(formula = formula, data = traj2)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.01301  -0.03416   0.01471   0.07579   0.47465  
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  9.386e-01  1.321e-02  71.041  < 2e-16 ***
## dist        -1.163e+02  1.449e+00 -80.237  < 2e-16 ***
## abs.angle   -1.918e-02  1.095e-03 -17.504  < 2e-16 ***
## circle2000   2.550e-03  3.435e-04   7.421 1.29e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.02620525)
## 
##     Null deviance: 1155.92  on 7563  degrees of freedom
## Residual deviance:  198.11  on 7560  degrees of freedom
##   (75 observations effacées parce que manquantes)
## AIC: -6074.8
## 
## Number of Fisher Scoring iterations: 2
plot(gear.glm.plus.nb)

R2.pred.plus.nb<-glm.predict(filter(R2test_retour,code_engin==engin_encours),gear.glm.plus.nb,seuil=0.5) %>% filter(!is.na(predict.glm))


p1<-ggplot(R2.pred.plus.nb)+geom_sf(aes(color=as.factor(predict.glm)),alpha=0.45,lwd=1)+
    scale_fill_viridis_c(option = "plasma", trans = "sqrt")
p2<-ggplot(filter(R2.pred.plus.nb,code_engin==engin_encours))+geom_sf(aes(color=activity_plus),alpha=0.45,lwd=1)+
    scale_fill_viridis_c(option = "plasma", trans = "sqrt")
ggarrange(p1,p2,ncol=1)

The AIC of his model (same set of data but new variable in the formula) is better (-4300 instead of -3800). We could thus consider that adding thist circle2000 co variable imporve the model

Random Forest prediction

Random forest is another algorythm which is able to predict fishing event using observed values. Methodology is describe in folowing article :

[Estimating fishing effort in small-scale fisheries using GPS tracking data and random forests] (https://dx.doi.org/10.1016/j.ecolind.2020.107321).

model.traj.RF and RF.predict function are similar to model.traj.glm and gm.predict function.

#To select some 
form <-"dist+abs.angle+circle2000"
mod.RF<-model.traj.RF(traj=R2.pred.plus.nb ,observation='activity_plus',form=form)
## [1] "activity_plus~dist+abs.angle+circle2000"
R2.pred.plus.nb.RF<-RF.predict(traj=R2.pred.plus.nb,mod.RF)
## Joining, by = c("date", "dx", "dy", "dist", "dt", "R2n", "abs.angle", "rel.angle", "no_trajet", "track_fid", "filename",
## "code_village", "code_engin", "code_pecheur", "duree", "longitude", "latitude", "id", "obs", "activity", "activity_plus",
## "predict.glm.int", "predict.glm", "circle2000")
ggplot(R2.pred.plus.nb.RF)+geom_sf(aes(color=predict.RF),lwd=0.1)+geom_sf(data=st_crop(fond,st_bbox(R2.pred.plus.nb.RF)))+
  scale_color_manual(values = c("active" = "lightgreen","UK"="orange")) +
  ggtitle("Map of predict activities using RF method")

ggplot(R2.pred.plus.nb.RF)+geom_sf(aes(color=predict.glm),lwd=0.1)+geom_sf(data=st_crop(fond,st_bbox(R2.pred.plus.nb.RF)))+
  scale_color_manual(values = c("active" = "lightgreen","UK"="orange")) +
  ggtitle("Map of predict activities using GLM method")

Quality of models

Quality of models could be describe using SENSITIVITY, SPECIFICITY AND ACCURACY indicators.

Quality indicators specifications

The qualidy.ind function of the GPSMonitoring package aims to calculate those indicators using three parameters : the dataset, the name of the column for observed value and the column name for predict values.

qual.RF<-quality.ind(R2.pred.plus.nb.RF,col.activity='activity_plus',col.predict='predict.RF')

qual.glm<-quality.ind(R2.pred.plus.nb.RF,col.activity='activity_plus',col.predict='predict.glm')

qual.glm[[1]] %>% mutate(model='glm') %>% select(model,accuracy,sensitivity,specificity) %>% pivot_longer(cols=!model)->glm.ind
qual.RF[[1]] %>% mutate(model='RF') %>% select(model,accuracy,sensitivity,specificity) %>% pivot_longer(cols=!model)->RF.ind

rbind(glm.ind,RF.ind) %>% 
  ggplot()+geom_bar(aes(x=name,y=value,fill=model),stat='identity',position='dodge')+
  ggtitle("Quality comparaison between two last models GLM/RF")

Regular grid end fishing event

engin_encours<-'FMCy'

#Remettre avec un shapefile dans data
#fond<-st_read(con,query="select st_intersection(st_buffer(geom,0),ST_GeomFromText('POLYGON((-18 13,-13 13,-13 7.94,-18 7.94,-18 13))',4326)) as geom 
#                      from communes_uni")
data(grid)

ggplot(grid)+geom_sf(color=NA)+geom_sf(data=R2.pred.plus.nb.RF)

 st_join(grid,filter(R2.pred.plus.nb.RF,predict.RF=='active'),left=FALSE) %>% group_by(code_engin,id.x) %>% dplyr::summarize(total=n()) %>% 
   ggplot()+ geom_sf(aes(fill=total),lwd=0)+geom_sf(data=fond)+facet_wrap(~code_engin)+ scale_fill_continuous(trans = 'reverse')+
  xlim(as.numeric(bb$xmin),as.numeric(bb$xmax))+ ylim(as.numeric(bb$ymin),as.numeric(bb$ymax))+
   ggtitle("Map of fishing event density")
## `summarise()` has grouped output by 'code_engin'. You can override using the `.groups` argument.

st_join(grid,R2.pred.plus.nb.RF,left=FALSE) %>% group_by(code_engin,id.x) %>% dplyr::summarize(total=n()) %>% 
   ggplot()+ geom_sf(aes(fill=total),lwd=0)+geom_sf(data=fond)+facet_wrap(~code_engin)+ scale_fill_continuous(trans = 'reverse')+
  xlim(as.numeric(bb$xmin),as.numeric(bb$xmax))+ ylim(as.numeric(bb$ymin),as.numeric(bb$ymax))+
  ggtitle("Map of GPS data density")
## `summarise()` has grouped output by 'code_engin'. You can override using the `.groups` argument.

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.